range 1.7 < z < 4.1; b.) for redshift range z < 2.51 and c.) for z > 2.51. k is in
units of $h$ Mpc$^{-1}$. The error bars are obtained from the average over the samples
of QSO absorption spectra. For clarity, the points $P^{var}(k)_j$ are plotted at
log k + 0.05.

Figure 6 The same as Figure 5 for the sample of JB Lyα forest with width > 0.32 Å.

Figure 7 The same as Figure 5 for the sample of JB Lyα forest with width > 0.16 Å.

Figure 8 Reconstruction of 3-D spectra given by $P(k)_j$ (diamond) and $P^{var}(k)_j$
(star) of LWT Lyα forest samples with width > 0.36 Å. For clarity, the points
reconstructed by variance are plotted at log k + 0.05. The gray and dark bands
are the linear spectra of the SCDM and CHDM, respectively.

Figure 9 The same as Figure 8 for a.) W > 0.32 Å and b.) W > 0.16 Å of JB Lyα
forest samples.

Figure captions

Figure 1 Reconstruction of a power law spectrum $P(k) = k^{-2}$ by a.) FFT, and b.)
wavelet SSD, where $k = 2\pi n/L$ is the wavenumber in a length unit. The points
of $\log_2 P_j^{var}$ (star) in 1b have been shifted down to $\log_2 P_j^{var} - 1$ for clarity of
presentation only. The slopes of the lines $\log_2 P_j$ vs. $j$ (diamond) or $\log_2 P_j^{var}$ vs. $j$
are -2 + 1 = -1.

Figure 2 Reconstruction of the spectrum with a typical scale, eq.(37). $k = 2\pi n/L$ is the
wavenumber in a length unit. The samples in a, b and c are produced from
spectrum (37) on L with the bin numbers of 256, 512 and 1024, and the bin size
is $2\pi$ length units. The peak of the spectrum is at log k = -1.37.

Figure 3 Reconstruction of the 3-D spectrum from simulated Lyα forest samples of the
SCDM model: a.) for lines with width larger than 0.16 Å and b.) for lines
without width selection. The data and 1σ error bars are found from the average
over 20 simulated samples. k is in units of $h$ Mpc$^{-1}$. The gray band is the linear
spectrum of the SCDM. The center line of the gray band is the power spectrum
at z = 2.8, and the lower and upper edges are the spectra at z = 4.1 and 1.7,
respectively.

Figure 4 Reconstruction of the 3-D spectrum from simulated Lyα forest samples with
width larger than 0.16 Å in the CHDM model. The gray band is the linear
spectrum of the CHDM. The center line of the gray band is the power spectrum
at z = 2.8, and the lower and upper edges are the spectra at z = 4.1 and 1.7,
respectively.

Figure 5 1-D spectra $P(k)_j$ (diamond) and $P^{var}(k)_j$ (star) of LWT Lyα forest samples
with width > 0.36 Å. a.) is the spectrum given by data in the entire redshift

This is eq.(17). An alternative form, which uses the Fourier transform of the basic
function $\psi(x)$ rather than $\psi_{j,l}(x)$, can be derived from eq.(A20) as follows

From $\delta_{j,j'}$, $j'$ should be equal to $j$, and then $l' < 2^j$. Therefore, $\delta_{l+2^j m'',l'}$ requires
$m'' = 0$ and $l = l'$. We then have the Parseval theorem (9).

A.3 Derivation of eqs.(17) and (18)

Substituting the wavelet expansion of $\delta(x)$ [eq.(A6)] into eq.(3), we have

Therefore, this term is the mean density $\bar\rho$. Because $\int\psi_{j,l}(x)\,dx = 0$ for all
basis functions, one finally finds from eq.(A6)

$$\delta(x) = \frac{\rho(x)-\bar\rho}{\bar\rho} = \sum_{j=0}^{\infty}\sum_{l=-\infty}^{\infty}\tilde\delta_{j,l}\,\psi_{j,l}(x) \qquad (A14)$$

where the father function coefficients $\tilde\delta_{j,l}$ are given by eq.(A8) or eq.(8).
A.2 Parseval theorem

From eq.(8) or (A8), we have

Here we make a change of variable $x' = x - 2^{-j}KL$. Therefore, when $2^{-j}K$ is an
integer, i.e. $K = 2^j m$, eq.(A15) is

$$\tilde\delta_{j,l+2^j m} = \int_{-\infty}^{\infty}\delta(x' + mL)\left(\frac{2^j}{L}\right)^{1/2}\psi(2^j x'/L - l)\,dx' = \tilde\delta_{j,l},$$

which shows that the FFCs, $\tilde\delta_{j,l}$, are periodic in $l$ with period $2^j$.

We now show the Parseval theorem. From the expansion (A14) we have

where $\bar\rho$ is the mean density, and the coefficients are calculated by the inner products

$$c_m = \int_{-\infty}^{\infty}\rho(x)\,\phi_m(x)\,dx \qquad (A7)$$

$$\tilde\delta_{j,l} = \int_{-\infty}^{\infty}\delta(x)\,\psi_{j,l}(x)\,dx \qquad (A8)$$

where $\delta(x) = (\rho(x) - \bar\rho)/\bar\rho$ is the density contrast.

Because, by definition, $\rho(x) = \rho(x + mL)$ for integers $m$, eq.(A7) can be rewritten as

$$c_m = \int_{-\infty}^{\infty}\rho(x + mL)\,\phi_m(x)\,dx \qquad (A9)$$
Using eq.(A4), we have then

where $x' = x + mL$. Therefore, the coefficients $c_m$ are independent of $m$. Considering the
"partition of unity" property of the scaling function (Daubechies 1992)

$$\sum_{m=-\infty}^{\infty}\phi(\eta + m) = 1, \qquad (A11)$$

one has from eq.(A10)

$$c_m = L^{-1/2}\int_0^L\rho(x)\,dx. \qquad (A12)$$

The first term in the expansion (A6) becomes

$$\sum_{m=-\infty}^{\infty}c_m\phi_m(x) = c_0 L^{-1/2}\sum_{m=-\infty}^{\infty}\phi(x/L + m) = L^{-1}\int_0^L\rho(x)\,dx. \qquad (A13)$$

Appendix

A.1 Wavelet decomposition of $\delta(x)$

Compactly supported discrete wavelets can be constructed from the scaling function
$\phi(\eta)$, which is a solution of the recursive equation (Daubechies 1992, Meyer 1993)

$$\phi(\eta) = \sum_l a_l\,\phi(2\eta - l) \qquad (A1)$$

where $l$ is an integer. If the coefficients $a_l$ are real and satisfy the conditions
$\sum_l a_l a_{l+2m} = 2\delta_{0,m}$ and $\sum_l a_l = 2$, the solution $\phi(\eta)$ will be orthogonal to
its integer translates, i.e. one has

$$\int\phi(\eta - m)\,\phi(\eta - l)\,d\eta = \delta_{m,l} \qquad (A2)$$

The basic wavelet $\psi(\eta)$ is defined as

$$\psi(\eta) = \sum_l(-1)^l a_{l+1}\,\phi(2\eta + l) \qquad (A3)$$

The variable $\eta$ is dimensionless.
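As a quick numerical check of the two conditions on the coefficients quoted with eq.(A1), the sketch below evaluates them for the four Daubechies 4 (D4) coefficients. The normalization used here (coefficients summing to 2) is an assumption chosen to match those conditions; the helper name `shifted_product` is illustrative only.

```python
import numpy as np

# D4 scaling coefficients, normalized so that sum(a) = 2 (assumed convention,
# chosen to match the conditions quoted with eq. (A1)).
s3 = np.sqrt(3.0)
a = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / 4.0

def shifted_product(a, m):
    """Return sum_l a_l * a_{l+2m}, treating out-of-range coefficients as zero."""
    shift = 2 * abs(m)
    if shift >= len(a):
        return 0.0
    return float(np.dot(a[:len(a) - shift], a[shift:]))

sum_a = float(a.sum())                                   # should equal 2
orthogonality = {m: shifted_product(a, m) for m in (0, 1, 2)}
print(sum_a, orthogonality)                              # m = 0 gives 2, m != 0 gives 0
```

Both conditions hold to machine precision, so the integer translates of the resulting scaling function are orthogonal as stated in eq.(A2).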

To conduct a wavelet expansion, one constructs the bases by dilation and translation
of $\phi(x)$ and $\psi(x)$ as

$$\phi_m(x) = L^{-1/2}\,\phi(x/L - m) \qquad (A4)$$

and

$$\psi_{j,l}(x) = \left(\frac{2^j}{L}\right)^{1/2}\psi(2^j x/L - l) \qquad (A5)$$

where the variable $x$ and the parameter $L$ have the dimension of length. Father functions
$\psi_{j,l}(x)$ and mother functions $\phi_m(x)$, with integer $j, l, m$ in $-\infty < j, l, m < \infty$, form a
complete, orthonormal basis in the space of functions with period length L. Therefore,
a density distribution $\rho(x)$ can be written as

$$\rho(x) = \sum_{m=-\infty}^{\infty}c_m\phi_m(x) + \bar\rho\sum_{j=0}^{\infty}\sum_{l=-\infty}^{\infty}\tilde\delta_{j,l}\,\psi_{j,l}(x) \qquad (A6)$$

insightful conversations.

$\delta_n$ into wavelet FFCs $\tilde\delta_{j,l}$, and vice versa, if the realization is incomplete. Therefore,
in terms of large scale structure study, the wavelet SSD $\tilde\delta_{j,l}$ is not a mathematical
alternative to the Fourier expansion, but instead may reveal some physical features
which are missed by $\delta_n$.

Since the bases of the wavelet transform are localized, while those of the Fourier transform
are long-range coherent, the wavelet SSD and the Fourier transform measure different
aspects of the density fields. The ergodic hypothesis essentially assumes that the spatial
correlations are decreasing sufficiently rapidly with increasing separations. The vol- 
umes separated with distances larger than the correlation length can be considered 
as statistically independent regions. Such volumes can then be treated as indepen- 
dent realizations. As we have shown in eq.(22), such independence might easily be 
described by the FFCs. Therefore, wavelet SSD is probably more effective in picking 
up information from spatially incomplete samples. This point will become much more 
clear when we study the non-Gaussianity of the density distributions (Pando & Fang 
1995). 

Applying wavelet SSD spectrum analysis to samples of the Lyα forests, some
common features in their spectra have been revealed. They are: 1.) the magnitude of
the 1-D spectra is significantly different from that of a Poisson process; 2.) the 1-D spectra
are flat on scales less than about 5 $h^{-1}$ Mpc, and increase very slowly with scale on scales
larger than 5 $h^{-1}$ Mpc; 3.) the reconstructed 3-D spectra have about the same power
as the linear spectrum of the SCDM model on scales less than 40 $h^{-1}$ Mpc, but more power
than the SCDM model on scales larger than 40 $h^{-1}$ Mpc; 4.) the magnitudes of the high
redshift (z > 2.51) spectra generally are larger than those of the low redshift (z < 2.51)
results. Points 3.) and 4.) are probably caused by large geometric biasing on large
scales and high redshifts.

Both authors wish to thank Professor P. Carruthers, and Drs. H.G. Bi and P. Lipa

by small scale terms in $P_3(k)$ (Kaiser & Peacock 1991). Therefore, $P(k)$ does not
contain information of $P_3(k)$ on scales larger than the bending scale. It is impossible
to reconstruct the bending of the 3-D spectrum from 1-D samples.

5. Conclusion 

We showed that the wavelet SSD is an efficient and reliable tool for detecting 
the spectrum of density perturbations. For samples of objects tracing the density 
distribution, the spectrum of density perturbations can be perfectly reconstructed. In 
this method, no mean density is needed in detecting the spectrum, and the problem 
of complex geometry of the samples can also be overcome since the wavelet transform
bases are always orthogonal and localized, regardless of the geometry of the samples.
Therefore, the wavelet SSD has great potential in detecting the spectrum of 2-D and 
3-D samples. 

In this paper, we always followed the convention that the spectrum is given by
the decomposition with respect to a Fourier basis. In fact, one can also call $P_j$ or $P_j^{var}$
the spectrum, but with respect to the wavelet bases $\psi_{j,l}$. In principle, the spectrum
of density perturbations can be described by any complete and orthogonal bases.
The descriptions based on different sets of complete and orthogonal bases should be 
equivalent if we have an ensemble of realizations of the cosmic stochastic density 
field. However, no ensemble of realizations is available in cosmology. To overcome 
this difficulty, it is assumed that the cosmic density field is ergodic. According to this 
hypothesis, the spectrum can be obtained from the spatial average in one realization 
of the density distribution (Peebles 1980). Yet, cosmological observations cannot
even provide one realization. Observed samples of large scale distributions must be 
incomplete as they are constrained by, at the very least, the size of the horizon. 
Therefore, the information detected by different base sets is not completely equivalent. 
It can be seen from eqs.(15) and (17) that one cannot transfer the Fourier components

claimed at a 90% and higher confidence level by other 1-D sample statistics of pencil-
beam redshift surveys of galaxies, QSOs, and CIV absorption line systems in the QSO
spectrum (Mo, et al. 1992b; Deng, Xia & Fang 1994).

4.2 3-D spectrum

As in §3.3, we constructed the 3-D spectra from the 1-D spectra of the LWT and
JB samples. The results are given in Figures 8 and 9. We also plotted the linear
spectra of the SCDM and CHDM in Figures 8 and 9. Obviously, one cannot directly
test the models by the reconstructed spectrum because the formation of HI clouds
underwent non-linear evolution and the identification of Lyα absorption lines is
subject to selection effects. However, considering that high redshift Lyα clouds are weakly
non-linear, it should be interesting to compare the reconstructed spectra with models.

The reconstructed spectra have about the same order of magnitude as the theoretical
spectrum of the SCDM model on scales of < 40 $h^{-1}$ Mpc, but are larger than
the model on scales > 40 $h^{-1}$ Mpc (see Figures 8 and 9). Namely, the differences
between the theoretical and reconstructed spectra increase as the scale increases. Like
the systematic differences seen in Figures 3 and 4, these differences may also be due
to geometric biasing. It is our experience from cluster identification (PF) that the
larger scale clusters of Lyα clouds are always located where the amplitude of the MFCs
is higher on smaller scales. Therefore, the larger the scale, the larger the effect of
geometric biasing.

The reconstructed 3-D spectra do not show peaks or bending on a scale of about
100 $h^{-1}$ Mpc, a feature that is expected in both the SCDM and CHDM models. This
should not be a surprise. When the scale is less than the bending scale 100 $h^{-1}$ Mpc,
one can use eq.(39) to reconstruct the 3-D spectrum from 1-D spectra. However, when
the scale is larger than the bending scale, eq.(39) will no longer hold. In this range,
$P_3(k) \propto k^{\alpha}$, with $\alpha > 0$, and eq.(38) shows that the 1-D spectrum $P(k)$ is dominated

spectrum magnitude is increasing as the redshift decreases. However, one should not
directly identify the redshift-dependence of the reconstructed spectrum magnitudes
with the result of the density perturbation evolution because, as discussed in §3.3, the
geometric biasing effect has significant impact on the magnitude of the spectrum of
the selected objects.

One can also see from Figures 5-7 that all the spectra are rather flat in the
range of log k > 0, or scales less than about 5 $h^{-1}$ Mpc. This is consistent with the
fact that no power in the Lyα line-line correlation functions has been detected from these
two compiled samples. On scales larger than 5 $h^{-1}$ Mpc, the spectra slightly increase
with the increase of scale. However, this increase cannot be detected by the two-point
correlation function because the correlation function is dominated by noise on such
large scales.

Other studies have detected several length scales in the distribution of Lyα forests,
including: 1.) 40 $h^{-1}$ Mpc of a void (Crotts 1989); 2.) 30-50 $h^{-1}$ Mpc from K-S
statistic (Fang 1991); 3.) 80, and even 120 $h^{-1}$ Mpc from typical scale analysis (Mo, et
al. 1992a). However, from these results one cannot conclude whether the distribution
of Lyα forest lines has multiple typical scales because the different scales may come
from the particular method being used. The wavelet SSD uniformly decomposes
samples into all scales. Therefore, one can objectively study if the distribution truly
has multiple typical scales. Figures 5-7 show that no typical scale of 30-50 $h^{-1}$
Mpc exists in the spectrum. Of course, the spectrum detected by the wavelet SSD
is the average on range $\Delta\log k = 0.3$, and therefore will overlook features of the
spectrum with width $\Delta\log k \ll 0.3$.

The only possible typical scale that can be seen in the spectra is 60 - 120 $h^{-1}$ Mpc.
Most spectra in Figures 5-7 appear to be flat, even dropping at $\log k \simeq -1$ to $-1.3$,
which is about the same as that given by Mo et al. (1992a). This scale has also been

z < 2.51. The error bars come from the average over the samples of QSO absorption
spectra.

The data are such that the reconstructed spectra are uncertain by a factor of about
10. Nevertheless, some interesting results can already be recognized. The spectra
obtained from the two independent data sets, the W > 0.36 Å LWT and the W >
0.32 Å JB, show almost the same features, including the magnitudes, the k-dependence
and the z-dependence (see Figures 5 and 6). Therefore, it would be reasonable to
consider these features as common properties of the spectra of Lyα forests.

The order of magnitude of the spectra shown in Figures 5 - 7 is $P(k) \sim 0.3 - 10$
$h^{-1}$ Mpc. In computing the spectra for the entire redshift range the finest scale is
taken to be J = 9, i.e. the number of bins is $N = 2^9 = 512$. For the spectra of the
divided redshift ranges, J = 8, and N = 256. As expected, the magnitudes are not
affected by the selection of J.

One can compare this magnitude with that of a Poisson process. If the 1-D
distribution of the Lyα forests is given by a Poissonian process, the spectrum should
be a white noise spectrum and its magnitude is (Vanmarcke 1983)

$$P_{Poisson}(k) = \frac{D}{4\pi N}\ h^{-1}\,{\rm Mpc} \qquad (41)$$

where D $h^{-1}$ Mpc is the spatial range of the sample. In our case, $D = D_{max} - D_{min}$
for J = 9, and $D = (D_{max} - D_{min})/2$ for J = 8, therefore $P_{Poisson} \simeq 0.15$ $h^{-1}$ Mpc.
This magnitude is less than the mean of $P(k)$ of the real data by a factor of 6. The
difference between $P(k)$ and $P_{Poisson}(k)$ is larger than $2\sigma$, where $\sigma^2$ is the variance
of $P(k)$. Therefore, the distribution of Lyα clouds is significantly different from a
Poissonian process.
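The white-noise level quoted above can be reproduced with a few lines of arithmetic. The sketch below assumes that eq.(41) reads $D/(4\pi N)$ and uses the comoving range of the samples quoted in this section (2,300 to 3,300 $h^{-1}$ Mpc); the function name `poisson_level` is illustrative.

```python
import math

def poisson_level(d_range, n_bins):
    """White-noise level, assuming eq. (41) reads D / (4 pi N), in h^-1 Mpc."""
    return d_range / (4.0 * math.pi * n_bins)

# Comoving range of the samples as quoted in the text (h^-1 Mpc).
d_min, d_max = 2300.0, 3300.0
full = poisson_level(d_max - d_min, 2 ** 9)          # J = 9, whole range
half = poisson_level((d_max - d_min) / 2.0, 2 ** 8)  # J = 8, half range
print(round(full, 3), round(half, 3))
```

Both cases give the same level because halving the range and halving the number of bins leave the bin size unchanged, which is consistent with the single value of about 0.15 $h^{-1}$ Mpc given in the text.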

The magnitudes of the z > 2.51 spectra generally are larger than those of the z <
2.51 results, with the spectra of the whole redshift range falling in between those of the two redshift
ranges (Figures 5-7). Apparently this result conflicts with spectrum evolution: the

As in our first paper (PF), we look at two data sets of the Lyα forests. The
first was compiled by Lu, Wolfe and Turnshek (1991, hereafter LWT). It contains ~
950 lines from the spectra of 38 QSOs that exhibit neither broad absorption lines nor
metal line systems. The second is from Bechtold (1994, hereafter JB), which contains
a total of ~ 2800 lines from 78 QSO spectra, in which 34 high redshift QSOs were
observed at moderate resolution. In our statistics, the effect of proximity to $z_{em}$ has
been considered. All lines with redshift $z_{em} > z > z_{em} - 0.15$ were deleted from our
samples. We assumed $q_0 = 1/2$, so the samples cover a comoving distance from about
$D_{min}$ = 2,300 $h^{-1}$ Mpc to $D_{max}$ = 3,300 $h^{-1}$ Mpc.

A problem in using real data to do statistics is the complex geometry of the QSO
Lyα forests. Different forests cover different spatial ranges, and no single forest
extends over the entire range $(D_{min}, D_{max})$. This is a difficulty in detecting the
spectrum by usual methods. At the very least, a complicated weighting scheme is
needed. However, this problem can easily be solved in the analysis of the wavelet
SSD. For a forest sample in a range $(D_1, D_2)$, we can make it a sample in the
range $(D_{min}, D_{max})$ by adding zeros to the data in the ranges $(D_{min}, D_1)$ and $(D_2, D_{max})$.
Since wavelets are local, the FFCs in the range $(D_1, D_2)$ will not be affected by the
addition of zeros in the ranges $(D_{min}, D_1)$ and $(D_2, D_{max})$. We can then compute any
statistic by simply dropping all FFCs, $\tilde\delta_{j,l}$, with coordinates $l$ in the added zero ranges.
Using this technique, geometrically complicated samples can be regularized. Therefore
all QSO samples can be treated uniformly, and no geometric weighting is needed.
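The zero-padding argument can be illustrated with a minimal sketch. It uses the Haar wavelet rather than D4, an assumption made only because each Haar coefficient depends on exactly one cell; for wavelets with wider support, cells near the edges of the observed range would presumably also need to be dropped. The sample geometry and the helper names `haar_ffc` and `cells_inside` are hypothetical.

```python
import numpy as np

def haar_ffc(signal, j):
    """Haar wavelet coefficients on scale j for a signal with 2**J bins (j < J)."""
    cells = signal.reshape(2 ** j, -1)      # one row per cell l = 0 ... 2**j - 1
    width = cells.shape[1]
    half = width // 2
    # Discrete Haar wavelet: +1 on the first half of a cell, -1 on the second.
    return (cells[:, :half].sum(axis=1) - cells[:, half:].sum(axis=1)) / np.sqrt(width)

def cells_inside(j, n_bins, first, last):
    """Mask of cells l on scale j lying entirely inside bins [first, last)."""
    width = n_bins // 2 ** j
    left = np.arange(2 ** j) * width
    return (left >= first) & (left + width <= last)

n_bins, first, last = 64, 16, 48            # hypothetical sample geometry
rng = np.random.default_rng(0)
full = rng.normal(size=n_bins)              # stand-in for a complete density field
padded = np.zeros(n_bins)
padded[first:last] = full[first:last]       # observed range, zero-padded outside

for j in range(6):                          # scales j = 0 ... 5
    keep = cells_inside(j, n_bins, first, last)
    # Coefficients of cells inside the observed range ignore the padding.
    assert np.allclose(haar_ffc(padded, j)[keep], haar_ffc(full, j)[keep])
```

Any statistic built only from the coefficients selected by `keep` is therefore the same whether the unobserved ranges hold zeros or real data, which is the sense in which no geometric weighting is needed.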

4.1 1-D spectrum 

The 1-D spectra determined from Lyα forests are shown in Figures 5, 6 and 7.
The spectra are calculated from the LWT data set with line width W > 0.36 Å, and
from the JB data set with W > 0.32 Å and W > 0.16 Å. For each data set, we computed
three spectra: a) the entire redshift range 1.7 < z < 4.1, b) redshift z > 2.51 and c)

This is partially because the larger scale structures form later, and small structures
earlier. In our first paper (PF) we found that the number density of the larger scale
clusters of Lyα lines increases as the redshift decreases. Therefore, the formation of
small scale structures is mainly determined by the spectrum at the larger redshifts
(the lower part of the gray band), while the larger scale structures are determined by the
lower redshift (or the upper part of the gray band).

From Figures 3 and 4 one can see that the power of the reconstructed spectrum
of the CHDM is lower than that of the corresponding SCDM. This is expected because the
power of the CHDM spectrum is lower than that of the SCDM.

It is more interesting to note that the power of the reconstructed CHDM spectrum
is significantly larger than the theoretical spectrum. This is likely due to the bias of
the selections. Because the clustering of CHDM is weak at high redshifts, the clouds
identified as Lyα absorbers are rare events, i.e. only the relatively high peaks in the
density field are selected. On the other hand, the high peaks in the simulations of the
SCDM model have been removed (step 3), therefore the Lyα clouds contain relatively
few high peaks. As a consequence of selecting the high peaks, geometric biasing should
be considered. The bias effect can also be seen from Figure 3b, which shows that the
power of the SCDM sample without width selection is lower than that of the W > 0.16 Å lines,
since the latter are rarer than the former.

The reconstruction at the largest scale (log k = -0.75) contains more uncertainty
due to the approximation of eq.(39), in which we describe the entire 1-D spectrum
$P(k)_j$ by a power law. Actually, the reconstructed 1-D spectrum $P(k)_j$ should not
be described by a power law spectrum with the same index on both small and large
scales. Because of the lack of data on the larger scales, one cannot calculate $P_3(k)$ by
the exact formula (38).

4. Spectrum determined by Lyα forests

13 redshift bins, each with $\Delta z = 0.2$ and centered at $z_n = n \times 0.2 + 1.3$ with n = 1
to 13. For redshift bin n, the spectrum is taken at redshift $z_n$. Therefore, one cannot
use these samples to reconstruct the spectrum on scales larger than the redshift range of
$\Delta z = 0.2$.

We subjected these samples to an SSD spectrum analysis by the D4 wavelet. The
3-D spectrum $P_3(k)$ can be determined from the 1-D spectrum $P(k)$ by (see BGF)

$$P(k) = 2\pi\int_k^{\infty}P_3(q)\,q\,dq. \qquad (38)$$

In deriving eq.(38), we have assumed that the random field is statistically isotropic.
If the 1-D spectrum can be approximated as a power law $P(k) \propto k^{-\alpha}$, with $\alpha > 0$, the
3-D spectrum is given by

$$\log P_3(k) = \log P(k) - 2\log k + \log(\alpha/2\pi). \qquad (39)$$

Using eq.(34), the wavenumber is now related to j by

$$k = 1.86\cdot 2\pi\,\frac{2^j}{D}\ h\,{\rm Mpc}^{-1} \qquad (40)$$

where D is the spatial range of the samples in units of $h^{-1}$ Mpc. From eqs.(39) and (40)
one can compute the 3-D spectrum $P_3(k)$ from the reconstructed 1-D spectrum $P(k)_j$.
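A minimal sketch of this conversion is given below. It assumes that eq.(40) reads $k = 1.86 \cdot 2\pi \cdot 2^j/D$ and that the logarithms in eq.(39) are base 10; the function names and the input values (a 1-D spectrum $P(k) = k^{-1}$, scale j = 5, D = 1000 $h^{-1}$ Mpc) are hypothetical and serve only as an illustration.

```python
import math

def wavenumber(j, d_range, n_p=1.86):
    """Assumed form of eq. (40): k = n_p * 2 pi * 2^j / D, in h Mpc^-1."""
    return n_p * 2.0 * math.pi * 2 ** j / d_range

def log_p3(log_p1, log_k, alpha):
    """Eq. (39): log P3 = log P - 2 log k + log(alpha / 2 pi), base-10 logs."""
    return log_p1 - 2.0 * log_k + math.log10(alpha / (2.0 * math.pi))

# Hypothetical inputs: P(k) = k^-alpha with alpha = 1, sampled on scale j = 5
# for a sample of D = 1000 h^-1 Mpc.
alpha = 1.0
k = wavenumber(5, 1000.0)
p3 = 10 ** log_p3(math.log10(k ** -alpha), math.log10(k), alpha)
print(k, p3)
```

For an exact power law the result equals $\alpha k^{-\alpha-2}/2\pi$, which is what one obtains by differentiating eq.(38) with respect to k.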

The reconstructed 3-D spectrum of the SCDM is shown in Figure 3, and the 
CHDM in Figure 4. The data and 1σ error bars are found from the average over 20
samples in each model. The linear spectra of the SCDM and CHDM used for the 
simulation are plotted as a gray band in Figures 3 and 4, respectively. The center line 
of the gray bands is the power spectrum at z = 2.8, and the lower and upper edges 
are the spectra at z = 4.1 and 1.7, respectively. 

In the case of the SCDM (Figure 3) the reconstructed spectrum generally agrees
with the theoretical spectrum, except that the power of the reconstructed
spectrum increases faster than the model's spectrum as the scale increases.

shown in Figure 2. The peak and the amplitude of the power spectrum are perfectly
detected by the wavelet SSD. Therefore, the statistics of $P_j$ or $P_j^{var}$ can effectively
provide information on the shape of the spectrum as well as its amplitude.

3.3 Simulation samples of Lyα forests

Now we examine the spectrum of the samples given by a simulation of Lyα
forests (Bi, Ge & Fang 1995, hereafter BGF). These samples have also been used
for the demonstration of cluster identification by the wavelet SSD (Pando & Fang
1996). The simulation was done by the following procedure: 1) generate dark matter
distributions by Gaussian perturbations with the linear power spectrum of the standard
cold dark matter model (SCDM), the cold plus hot dark matter model (CHDM), and
the low-density flat cold dark matter model (LCDM); 2) generate the baryonic matter
distribution by assuming that baryonic matter traces the dark matter distribution on
scales larger than the Jeans length of the baryonic gas, but is smooth over structures
on scales less than the Jeans length; 3) remove collapsed regions from the density field
because Lyα clouds are probably not virialized; 4) simulate the Lyα absorption spectrum
as the absorption of neutral hydrogen in the baryonic gas, and include the effects
of the observational instrumental point-spread-function, along with Poisson and
background noises; 5) determine the Lyα absorption lines and their widths from the
simulated spectrum by the usual method of Lyα line identification.

Obviously, the samples of the simulated Lyα forests depend not only on the
theoretical spectrum mentioned in step 1), but also on the selection effects
referred to in steps 2) - 5). One can expect that the reconstructed spectrum will not
completely match the theoretical spectra because the simulated spectra are distorted
by these selection effects.

The simulated samples cover a redshift range of 1.7 to 4.1. However, in order to
consider the redshift-dependence of the spectrum, the samples are synthesized from

on a scale range of $\log k \to \log k + \Delta\log k$ with $\Delta\log k = \log 2 = 0.301$. On the other
hand, the density of the FFT data in k-space is determined by the number of modes,
and therefore the Fourier spectrum data are distributed uniformly with respect
to k, not log k. The data points are dense at large log k, but sparse at small log k.

3.2 Normalization factors 

From eqs.(28) and (20), one can find two useful relations. They are

$$\log P(k)_j = \log P_j - (\log 2)\,j + A \qquad (33)$$

and

$$\log k = (\log 2)\,j - \log(N/2\pi) + B \qquad (34)$$

Eqs.(33) and (34) transfer the wavelet spectrum $P_j$ or $P_j^{var}$ into the mean Fourier
spectrum $P(k)_j$, and vice versa. The factor A normalizes the amplitude of $\log P_j$ with
$\log P(n)_j$,

$$A = -\log\left(2\Delta n_p|\hat\psi(n_p)|^2\right) \qquad (35)$$

The factor B normalizes the scale j with log k,

$$B = \log n_p. \qquad (36)$$

Obviously, the constants A and B depend on the basic wavelet being used in the
SSD analysis. For instance, in the case of the D4 wavelet, A = 0.602, and B = 0.270.
We tested these normalizations by the following spectrum

$$P(k) = \frac{k}{1 + 10^5 k^4} \qquad (37)$$

This spectrum has a peak at $\log k \simeq -1.37$, or a typical scale at 1/k = 23.4 (length)
units. Using (37), we produced samples of distributions over L with bin numbers of
256, 512 and 1024, and the bin size is $2\pi$ units. The reconstruction of spectrum (37) is

where $P(n)_j$ is the average of the Fourier spectrum on the scale j

$$P(n)_j = \frac{1}{2^j\Delta n_p}\sum_{n=(n_p-0.5\Delta n_p)2^j}^{(n_p+0.5\Delta n_p)2^j}P(n). \qquad (29)$$

3.1 Power index

For a power law spectrum, we have $P(an) = a^{\gamma}P(n)$, a being any constant, and
$\gamma$ the spectrum index. Because eq.(29) now gives $P(n)_{j+1} = 2^{\gamma}P(n)_j$, eq.(28) shows
that

$$P_j \propto 2^{j(\gamma+1)} \qquad (30)$$

or

$$\log_2 P_j = (\gamma + 1)\,j + {\rm const} \qquad (31)$$

Therefore, the slope of $\log_2 P_j$, when plotted against j, is $\gamma + 1$. The index of a power
law can be directly found by

$$\gamma = \frac{d\log_2 P_j}{dj} - 1 \qquad (32)$$

Figure 1 shows a simple example of a power law spectrum $P(k) = k^{-2}$, where
$k = 2\pi n/L$ is the wavenumber in a (length) unit. Using this spectrum we generated
distributions over L with bin number $2^9 = 512$, and the bin size is $2\pi$ units. The
FFT spectrum reconstruction is plotted in Figure 1a. Figure 1b shows the spectra $\log_2 P_j$
and $\log_2 P_j^{var}$ against j. The points of $\log_2 P_j^{var}$ in Figure 1b have been shifted down
to $\log_2 P_j^{var} - 1$ for presentation purposes only. As expected, Figure 1b shows 1.)
$P_j$ is equal to $P_j^{var}$; and 2.) the slopes of the lines $\log_2 P_j$ vs. $j$ or $\log_2 P_j^{var}$ vs. $j$ are
-2 + 1 = -1.
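The slope rule can be checked with simple arithmetic, without generating any random samples. The sketch below sums a power law $P(n) = n^{\gamma}$ over the Fourier band associated with each scale j, which by eqs.(28) and (29) is proportional to $P_j$, and then recovers $\gamma$ from the fitted slope via eq.(32). The band centre $n_p = 1.86$ follows from B = 0.270 for D4; the band width `dn_p = 1.0` and the function name `band_power` are illustrative assumptions.

```python
import math
import numpy as np

def band_power(j, gamma, n_p=1.86, dn_p=1.0):
    """Sum of P(n) = n**gamma over the Fourier band probed on scale j.

    By eqs. (28)-(29) this sum is proportional to P_j. The band width dn_p
    is an illustrative value, not taken from the paper.
    """
    lo = math.ceil((n_p - 0.5 * dn_p) * 2 ** j)
    hi = math.floor((n_p + 0.5 * dn_p) * 2 ** j)
    n = np.arange(lo, hi + 1, dtype=float)
    return float(np.sum(n ** gamma))

gamma_true = -2.0
scales = np.arange(4, 10)                      # j = 4 ... 9
log2_pj = np.log2([band_power(int(j), gamma_true) for j in scales])

# Eq. (32): gamma = d(log2 P_j)/dj - 1, with the slope from a straight-line fit.
slope = np.polyfit(scales, log2_pj, 1)[0]
gamma_est = slope - 1.0
print(slope, gamma_est)
```

The fitted slope comes out close to -1 and the recovered index close to -2, with small deviations caused by the discreteness of the modes on the coarsest scales.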

Comparing Figure 1b with 1a one can see that the points of the wavelet spectrum
$P(k)_j$ are uniformly distributed in $\log_{10} k$ space. Each $P(k)_j$ measures the spectrum $P(k)$

The corresponding basic function is

$$\psi(\eta) = \begin{cases} 1 & \text{if } 0 < \eta < 1/2 \\ -1 & \text{if } 1/2 < \eta < 1 \\ 0 & \text{otherwise} \end{cases} \qquad (24)$$

This is just the Haar wavelet [eq.(A3)], whose Fourier transform is given as

$$\hat\psi(n) = \frac{2}{\pi n}\left[\sin(\pi n) - i\cos(\pi n)\right]\sin^2(\pi n/2) \qquad (25)$$

When $n \ll 1$, $\hat\psi(n) \simeq -i(\pi/2)n$. Therefore, function (24) is not compactly supported
in Fourier space. In the case of a power law spectrum, $\delta_n = An^{\gamma}$, eq.(15) gives

$$\tilde\delta_{j,l} = \sum_{n<2^j}\left(\frac{2^j}{L}\right)^{-1/2}\delta_n\,\hat\psi(-n/2^j)\,e^{i2\pi nl/2^j} + \text{terms } n > 2^j$$

$$= \sum_{n<2^j} i\,\frac{\pi}{2}\left(\frac{2^j}{L}\right)^{-1/2}2^{-j}A\,n^{\gamma+1}e^{i2\pi nl/2^j} + \text{terms } n > 2^j. \qquad (26)$$

Therefore, when $\gamma < -1$, the large scale (small n) perturbations will significantly
contribute to, and even dominate, the $\tilde\delta_{j,l}$. In this case, the variance $\sigma^2(l)$ of cubical
cell CIC on scale $l$ will seriously be contaminated by perturbations on scales larger
than $l$, and therefore no longer be a measure of the spectrum at $l$.

3. Demonstration of spectrum reconstructions 

From eq.(10), the spectrum $P_j$ of the wavelet SSD can be detected by the mean
square of the FFCs. It can also be measured by the variance of the FFCs,

$$P_j^{var} = \frac{1}{L}\sum_{l=0}^{2^j-1}\left(\tilde\delta_{j,l} - \overline{\tilde\delta_{j,l}}\right)^2 \qquad (27)$$

where $\overline{\tilde\delta_{j,l}}$ is the average of $\tilde\delta_{j,l}$ over $l$. Because the mean of the FFCs, $\overline{\tilde\delta_{j,l}}$, is zero [eq.(15)],
$P_j$ is equal to $P_j^{var}$. Therefore, the relationship between the spectra in the Fourier
analysis and the wavelet SSD is [eq.(22)]

$$P(n)_j = \frac{1}{2^{j+1}\Delta n_p}\,|\hat\psi(n_p)|^{-2}\,P_j^{var} \qquad (28)$$

where $|n_p|$ are the positions of the peaks of $\hat\psi(n)$.

Because both $\psi(x)$ and $\delta(x)$ are real, we have $\hat\psi(-n_p) = \hat\psi^*(n_p)$ and $\delta_{-n} = \delta_n^*$.
Eq.(19) then becomes

$$|\tilde\delta_{j,l}|^2 \simeq \frac{L}{2^j}\left|2\sum_{n=(n_p-0.5\Delta n_p)2^j}^{(n_p+0.5\Delta n_p)2^j}{\rm Re}\left\{\hat\psi(-n_p)\,\delta_n e^{i2\pi nl/2^j}\right\}\right|^2$$

$$\simeq \frac{L}{2^j}|\hat\psi(n_p)|^2\left|2\sum_{n=(n_p-0.5\Delta n_p)2^j}^{(n_p+0.5\Delta n_p)2^j}|\delta_n|\cos(\theta_\psi + \theta_n + 2\pi nl/2^j)\right|^2 \qquad (21)$$

where $\theta_\psi$ and $\theta_n$ are the phases of $\hat\psi(-n_p)$ and $\delta_n$, respectively. In the case of Gaussian
perturbations, the distribution of $\theta_n$ is random. Therefore, eq.(21) reduces to

$$|\tilde\delta_{j,l}|^2 \simeq \frac{2L}{2^j}|\hat\psi(n_p)|^2\sum_{n=(n_p-0.5\Delta n_p)2^j}^{(n_p+0.5\Delta n_p)2^j}|\delta_n|^2. \qquad (22)$$

Eq.(22) shows that the FFCs, $|\tilde\delta_{j,l}|^2$, are $l$-independent. This means that different $l$
of $|\tilde\delta_{j,l}|^2$ can be treated as independent realizations of the stochastic variable $|\delta_n|^2$ with
$n \simeq n_p 2^j$. Using the language of the CIC statistic, one can say that each $l$ is a cell,
and $\tilde\delta_{j,l}$ is the "count" in the cell $l$. Therefore, the square average of the FFCs and its
variance over the $l$ "realizations" is a measure of the spectrum $P(n) = |\delta_n|^2$.

One can use this result to study the conditions under which CIC is a reasonable
approach. First, we note that the CIC assumes that the counts in cells with various
scales are a scale decomposition. However, this can be guaranteed only if the cells,
or window functions, with different scales are orthogonal. The CIC window functions
play a role similar to the mother functions in the wavelet SSD. Yet, as we know from
discrete wavelet analysis, the mother functions are not orthogonal on different scales.
Generally, one cannot find cell functions that form an orthogonal set. Therefore, the
MFCs, and then the CIC, are scale-mixed.

Secondly, the cubic cell CIC corresponds to a window of the form

$$\phi(\eta) = \begin{cases} 1 & \text{if } 0 < \eta < 1 \\ 0 & \text{otherwise} \end{cases} \qquad (23)$$

From expansions (3) and (6), one can also express the Fourier coefficient, $\delta_n$, in
terms of the FFCs as (see Appendix A.3)

$$\delta_n = \frac{1}{L}\sum_{j=0}^{\infty}\sum_{l=0}^{2^j-1}\tilde\delta_{j,l}\,\hat\psi_{j,l}(n) \qquad (17)$$

or

$$\delta_n = \sum_{j=0}^{\infty}\sum_{l=0}^{2^j-1}\left(\frac{1}{2^jL}\right)^{1/2}\tilde\delta_{j,l}\,e^{-i2\pi nl/2^j}\,\hat\psi(n/2^j), \quad n \neq 0. \qquad (18)$$

Eqs.(15) and (18) are the basic equations for detecting the spectrum by a wavelet SSD.

2.4 Detection of spectrum 

The father functions $\psi_{j,l}(x)$ are also compactly supported in Fourier space. For instance,
the Fourier transform $\hat\psi(n)$ of the Battle-Lemarié wavelet, which is constructed
with 4th order spline functions, is non-zero only in two symmetric narrow ranges centered,
respectively, at n = +1 and -1 with widths $\Delta n \ll 1$. For the Daubechies 4
(D4) wavelet, $\hat\psi(n)$ also has two symmetric peaks with centers at $n = \pm n_p$ and with
width $\Delta n_p$. Therefore, the sum over n in eq.(15) should only be taken over the two ranges
$(n_p - 0.5\Delta n_p)2^j < n < (n_p + 0.5\Delta n_p)2^j$ and $-(n_p + 0.5\Delta n_p)2^j < n < -(n_p - 0.5\Delta n_p)2^j$.
Eq.(15) can be approximately rewritten as

$$\tilde\delta_{j,l} \simeq \left(\frac{2^j}{L}\right)^{-1/2}\left[\hat\psi(-n_p)\sum_{n=(n_p-0.5\Delta n_p)2^j}^{(n_p+0.5\Delta n_p)2^j}\delta_n e^{i2\pi nl/2^j} + \hat\psi(n_p)\sum_{n=-(n_p+0.5\Delta n_p)2^j}^{-(n_p-0.5\Delta n_p)2^j}\delta_n e^{i2\pi nl/2^j}\right]$$

$$= \left(\frac{2^j}{L}\right)^{-1/2}\sum_{n=(n_p-0.5\Delta n_p)2^j}^{(n_p+0.5\Delta n_p)2^j}\left[\hat\psi(-n_p)\,\delta_n e^{i2\pi nl/2^j} + \hat\psi(n_p)\,\delta_{-n}e^{-i2\pi nl/2^j}\right]. \qquad (19)$$

Eq.(19) shows that the FFCs on scale j are mainly determined by the Fourier
components $\delta_n$ with n centered at

$$n = n_p 2^j, \qquad (20)$$

Comparing eqs.(9) and (4), one can relate the term $\frac{1}{L}\sum_{l=0}^{2^j-1} |\tilde{\delta}_{j,l}|^2$ to the power of 
perturbations on length scale $L/2^j$, and the term $|\tilde{\delta}_{j,l}|^2/L$ to the power of the 
perturbation on scale $L/2^j$ at position $lL/2^j$. Therefore, the spectrum with respect to 
wavelet bases can be defined as 

$$ P_j = \frac{1}{L} \sum_{l=0}^{2^j-1} |\tilde{\delta}_{j,l}|^2. \qquad (10) $$
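As an illustration of definition (10), the following minimal sketch computes $P_j$ for a field sampled on $N = 2^J$ bins and checks the Parseval relation (9) numerically. It assumes the Haar wavelet as the basic function $\psi(\eta)$ purely for simplicity (it is not claimed to be the wavelet used in this analysis), treats the field as constant within each bin, and removes the mean because the expansion carries no mean term. The names `haar_ffc` and `wavelet_spectrum` are hypothetical.

```python
import numpy as np

def haar_ffc(delta, L):
    """Return [c_0, ..., c_{J-1}]; c_j[l] is the coefficient on scale j, position l.

    Assumption: Haar basic wavelet, field constant within each of N = 2**J bins.
    """
    N = delta.size
    J = N.bit_length() - 1
    if 2**J != N:
        raise ValueError("number of bins must be a power of two")
    dx = L / N
    coeffs = []
    for j in range(J):
        block = N // 2**j                     # bins covered by one wavelet on scale j
        cells = delta.reshape(2**j, block)    # row l = support of psi_{j,l}
        half = block // 2
        diff = cells[:, :half].sum(axis=1) - cells[:, half:].sum(axis=1)
        coeffs.append(np.sqrt(2**j / L) * dx * diff)  # integral of delta * psi_{j,l}
    return coeffs

def wavelet_spectrum(delta, L):
    """P_j = (1/L) * sum over l of |coefficient_{j,l}|**2, as in definition (10)."""
    return np.array([np.sum(c**2) / L for c in haar_ffc(delta, L)])

rng = np.random.default_rng(0)
L = 100.0
delta = rng.normal(size=256)
delta -= delta.mean()                 # zero-mean field
P_j = wavelet_spectrum(delta, L)
parseval_lhs = np.mean(delta**2)      # (1/L) * integral of delta**2 for a binned field
parseval_rhs = P_j.sum()              # sum over scales j of P_j
```

For a zero-mean binned field the two sides agree to rounding error, since the Haar functions on scales $j < J$ together with the constant span all piecewise-constant fields on $2^J$ bins.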

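2.3 Relationship between $\delta_n$ and $\tilde{\delta}_{j,l}$ 

Substituting expansion (2) into eq.(8), we have 

$$ \tilde{\delta}_{j,l} = \sum_{n=-\infty}^{\infty} \delta_n \int_{-\infty}^{\infty} e^{i2\pi nx/L}\, \psi_{j,l}(x)\, dx = \sum_{n=-\infty}^{\infty} \delta_n\, \hat{\psi}_{j,l}(-n) \qquad (11) $$

where $\hat{\psi}_{j,l}(n)$ is the Fourier transform of $\psi_{j,l}(x)$, i.e. 

$$ \hat{\psi}_{j,l}(n) = \int_{-\infty}^{\infty} \psi_{j,l}(x)\, e^{-i2\pi nx/L}\, dx. \qquad (12) $$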



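Using eq.(7), one can rewrite eq.(11) as 

$$ \tilde{\delta}_{j,l} = \sum_{n=-\infty}^{\infty} \left(\frac{2^j}{L}\right)^{1/2} \delta_n \int_{-\infty}^{\infty} e^{i2\pi nx/L}\, \psi(2^j x/L - l)\, dx. \qquad (13) $$

Defining the variable $\eta = 2^j x/L - l$, one finds 

$$ \tilde{\delta}_{j,l} = \sum_{n=-\infty}^{\infty} \left(\frac{2^j}{L}\right)^{-1/2} \delta_n\, e^{i2\pi nl/2^j} \int_{-\infty}^{\infty} e^{i2\pi n\eta/2^j}\, \psi(\eta)\, d\eta \qquad (14) $$

or 

$$ \tilde{\delta}_{j,l} = \sum_{n=-\infty}^{\infty} \left(\frac{2^j}{L}\right)^{-1/2} \delta_n\, \hat{\psi}(-n/2^j)\, e^{i2\pi nl/2^j} \qquad (15) $$

where $\hat{\psi}(n)$ is the Fourier transform of the basic function $\psi(\eta)$, 

$$ \hat{\psi}(n) = \int_{-\infty}^{\infty} \psi(\eta)\, e^{-i2\pi n\eta}\, d\eta. \qquad (16) $$

Eq.(15) is the expression of the FFCs in terms of the Fourier amplitudes. 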
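Relation (15) can be checked numerically. The sketch below compares a direct quadrature of eq.(8) with the right-hand side of eq.(15) for a field made of a few Fourier modes. It assumes the Haar wavelet, whose transform (16) is $\hat{\psi}(\nu) = (1 - e^{-i\pi\nu})^2/(i2\pi\nu)$, only because that closed form is simple; the mode amplitudes, box size, and function names are illustrative choices.

```python
import numpy as np

def haar_hat(nu):
    """Transform (16) of the Haar wavelet (an assumed choice of basic function)."""
    nu = np.atleast_1d(np.asarray(nu, dtype=float))
    out = np.zeros(nu.shape, dtype=complex)
    nz = nu != 0                                   # psi_hat(0) = 0
    out[nz] = (1 - np.exp(-1j * np.pi * nu[nz]))**2 / (2j * np.pi * nu[nz])
    return out

def ffc_from_fourier(n, delta_n, j, l, L):
    """Right-hand side of eq.(15), summed over the supplied modes n."""
    phase = np.exp(2j * np.pi * n * l / 2**j)
    return np.sum(np.sqrt(L / 2**j) * delta_n * haar_hat(-n / 2**j) * phase)

def ffc_direct(n, delta_n, j, l, L, M=2**14):
    """Eq.(8) by midpoint quadrature on M bins over one period."""
    x = (np.arange(M) + 0.5) * L / M
    delta = np.sum(delta_n[:, None] * np.exp(2j * np.pi * n[:, None] * x / L), axis=0)
    eta = 2**j * x / L - l
    psi = np.where((eta >= 0) & (eta < 0.5), 1.0,
                   np.where((eta >= 0.5) & (eta < 1.0), -1.0, 0.0))
    return np.sum(delta * np.sqrt(2**j / L) * psi) * L / M

L = 8.0
n_pos = np.array([1, 3, 5])
a_pos = np.array([0.5 + 0.2j, -0.3j, 0.1 + 0j])
n = np.concatenate([n_pos, -n_pos])
delta_n = np.concatenate([a_pos, a_pos.conj()])    # real field: delta_{-n} = conj(delta_n)

pairs = [(0, 0), (1, 1), (2, 3), (3, 5)]           # (j, l) with 0 <= l < 2**j
from_15 = np.array([ffc_from_fourier(n, delta_n, j, l, L) for j, l in pairs])
from_8 = np.array([ffc_direct(n, delta_n, j, l, L) for j, l in pairs])
```

The two arrays should agree up to the quadrature error, and the imaginary parts should vanish because the field is real.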
which shows that the perturbations can be decomposed into domains, n, by the 
orthonormal Fourier basis functions. The power spectrum of perturbations on length 
scale L/n is then defined as 

$$ P(n) = |\delta_n|^2. \qquad (5) $$
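A minimal numerical sketch of definition (5) follows. It assumes the Fourier convention $\delta(x) = \sum_n \delta_n e^{i2\pi nx/L}$ with $\delta_n = \frac{1}{L}\int_0^L \delta(x) e^{-i2\pi nx/L} dx$, approximated on $N$ equally spaced bins by a discrete Fourier transform divided by $N$. The test density, a single cosine mode of amplitude 0.2, and the function names are hypothetical.

```python
import numpy as np

def density_contrast(rho):
    """delta = (rho - mean) / mean for a sampled density field."""
    rho_bar = rho.mean()
    return (rho - rho_bar) / rho_bar

def fourier_spectrum(delta):
    """P(n) = |delta_n|**2, with delta_n estimated as FFT(delta) / N."""
    delta_n = np.fft.fft(delta) / delta.size
    return np.abs(delta_n)**2

L, N = 100.0, 256
x = L * np.arange(N) / N
rho = 1.0 + 0.2 * np.cos(2 * np.pi * 3 * x / L)   # single mode at n = +3 and n = -3
delta = density_contrast(rho)
P = fourier_spectrum(delta)                       # index N - n holds the mode -n
```

Here each of the modes $n = \pm 3$ carries $|\delta_n|^2 = 0.1^2$, and the sum of $P(n)$ equals the mean of $\delta^2$, which is the discrete form of the Parseval relation.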

2.2 Spectrum with respect to wavelet base 

To subject the density contrast $\delta(x)$ to a wavelet expansion, we first assume that 
$\delta(x)$ is an L-periodic function defined on the space $-\infty < x < \infty$. A wavelet expansion 
of $\delta(x)$ is given by (Daubechies 1992, see also Appendix A.2) 

$$ \delta(x) = \sum_{j=0}^{\infty} \sum_{l=-\infty}^{\infty} \tilde{\delta}_{j,l}\, \psi_{j,l}(x) \qquad (6) $$

where $\psi_{j,l}(x)$ is the base function, also called father function, defined as 

$$ \psi_{j,l}(x) = \left(\frac{2^j}{L}\right)^{1/2} \psi(2^j x/L - l). \qquad (7) $$

The real function $\psi(\eta)$ is the basic wavelet, which is localized in the range from $\eta = 0$ to 
$\eta = 1$ with center at $\eta = 1/2$ (Appendix A.1). Eq.(7) shows that the father functions 
$\psi_{j,l}(x)$ are generated from the basic function $\psi(x/L)$ by a dilation $2^j$ and a translation 
$l$. Therefore, the father functions $\psi_{j,l}(x)$ have scale $L/2^j$ and are centered at $lL/2^j$. 
The bases $\psi_{j,l}(x)$ are complete and orthonormal with respect to both indexes j and 
l. Therefore, the FFCs, $\tilde{\delta}_{j,l}$, in eq.(6) are given by 

$$ \tilde{\delta}_{j,l} = \int_{-\infty}^{\infty} \delta(x)\, \psi_{j,l}(x)\, dx. \qquad (8) $$
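The orthonormality behind eq.(8) can be illustrated on a grid. The sketch below builds the functions of eq.(7) with the Haar wavelet as an assumed basic function $\psi(\eta)$, restricts $l$ to one period ($0 \le l < 2^j$), and checks that their overlap integrals form an identity matrix. It then computes the coefficients by eq.(8) and rebuilds a zero-mean test field by eq.(6) truncated at $j < J$. The grid size and the test field are arbitrary choices.

```python
import numpy as np

def psi(eta):
    """Haar basic wavelet: +1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere."""
    eta = np.asarray(eta, dtype=float)
    return np.where((eta >= 0) & (eta < 0.5), 1.0,
                    np.where((eta >= 0.5) & (eta < 1.0), -1.0, 0.0))

def psi_jl(x, j, l, L):
    """Eq.(7): dilation 2**j and translation l of the basic wavelet."""
    return np.sqrt(2**j / L) * psi(2**j * x / L - l)

L, N = 1.0, 64
dx = L / N
x = (np.arange(N) + 0.5) * dx            # bin midpoints
J = 6                                     # scales j = 0..5 are resolved by 64 bins
basis = np.array([psi_jl(x, j, l, L) for j in range(J) for l in range(2**j)])

gram = basis @ basis.T * dx               # integrals of psi_{j,l} * psi_{j',l'}

delta = np.sin(2 * np.pi * 3 * x / L) + 0.5 * np.cos(2 * np.pi * 7 * x / L)
delta -= delta.mean()                     # the expansion carries no mean term
ffc = basis @ delta * dx                  # eq.(8) evaluated on the grid
recon = ffc @ basis                       # eq.(6) truncated at j < J
```

On this grid the reconstruction is exact for any zero-mean binned field, because the 63 Haar functions span every such field.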

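Similar to the Fourier expansion, the Parseval theorem for the expansion (6) is 
(see Appendix A.2) 

$$ \frac{1}{L} \int_0^L |\delta(x)|^2\, dx = \sum_{j=0}^{\infty} \frac{1}{L} \sum_{l=0}^{2^j-1} |\tilde{\delta}_{j,l}|^2. \qquad (9) $$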



sions. Therefore, one can reconstruct the spectrum from the statistics of the mean 
of the FFCs and the variance (Yamada & Ohkitani 1991). We apply this method to 
simulation samples and real data of the Lyα forests. 

The contents of the paper are arranged as follows: In §2 and in the Appendix, 
we describe the method, and derive all formulae needed for the determination of 
the spectrum via discrete wavelet SSD. §3 demonstrates the wavelet SSD spectrum 
measure, including reconstruction of a power law spectrum, determination of typical 
scales, and illustration of selection effects by the simulation samples of Lyα forests. 
In §4 we apply the method to real data of Lyα forests; both 1-dimensional (1-D) and 
3-dimensional (3-D) spectra are found. §5 contains discussions and conclusions. 

2. Wavelet SSD and spectrum of density field 

2.1 Spectrum in Fourier analysis 

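Without loss of generality, we consider a 1-D density field $\rho(x)$ over a range 
$0 < x < L$. It is convenient to use the density contrast defined by 

$$ \delta(x) = \frac{\rho(x) - \bar{\rho}}{\bar{\rho}} \qquad (1) $$

where $\bar{\rho}$ is the mean density in this field. To express $\delta(x)$ as a Fourier expansion, we 
take the convention 

$$ \delta(x) = \sum_{n=-\infty}^{\infty} \delta_n\, e^{i2\pi nx/L} \qquad (2) $$

with the coefficients computed by 

$$ \delta_n = \frac{1}{L} \int_0^L \delta(x)\, e^{-i2\pi nx/L}\, dx. \qquad (3) $$

The Parseval theorem for the Fourier expansion (3) is 

$$ \frac{1}{L} \int_0^L |\delta(x)|^2\, dx = \sum_{n=-\infty}^{\infty} |\delta_n|^2 \qquad (4) $$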
mean density of the sample. It is difficult, or sometimes even impossible, to 
accurately determine the mean density of objects because of the lack of information about the 
objects' distribution on scales larger than the size of the samples being considered. 
The Fourier transform of weighted galaxy counts is based on the assumption that all 
density fluctuations are zero outside of the volume considered (Feldman, Kaiser & 
Peacock 1994). If the density fluctuation field is a homogeneous random process, the 
average of Fourier amplitudes over an ensemble of the fluctuation fields with finite 
extent (zero outside) will be the same as that over an ensemble of the fluctuation field 
of infinite extent (Adler 1981). Unfortunately, no ensemble is available in cosmology. 
The effect of the finite size of samples cannot be eliminated because the Fourier bases 
are delocalized. On the other hand, in a CIC analysis by a window of limited spatial 
support, the behavior of the perturbations on scales larger than the size of a sample 
does not play an important role. 

The problem with the CIC statistics is that its basis (window) functions are not 
orthogonal. The variances obtained from the decomposition of cells with different 
scale $l$ are not independent of each other. Hence, one cannot reconstruct the power 
spectrum by $\sigma^2(l)$. Moreover, the cubical cell is not localized in Fourier space. We will 
show that this becomes a severe problem in the case where the power law spectrum 
has a negative index. 

Because of the above-mentioned problems, it is theoretically important to study 
the spectrum by means of a complete, orthogonal, and localized basis. The wavelet 
SSD is such a tool. The father function coefficients (FFCs) of the discrete wavelet 
SSD directly describe the fluctuations of density fields on each scale. The variance 
of the FFCs on each scale is similar to the variance of the CIC. In addition, the 
bases of the discrete wavelet transform are complete and orthogonal, and it is easy to 
find the relationship between the coefficients of the Fourier and the wavelet expan- 
1. Introduction 



We have recently shown that a space-scale decomposition (SSD) analysis based 
on the discrete wavelet transform is a powerful tool in detecting structures in the 
spatial distribution of objects in the universe (Pando & Fang 1996, hereafter PF). 
The spatial distributions on various scales can be systematically reconstructed from 
the mother function coefficients (MFCs) of a wavelet-based SSD. The clusters can then 
be identified, scale by scale, from these decomposed distributions. Using this method, 
the clustering of QSO Lyα forest lines and its evolution have been studied. The 
distributions of the wavelet-identified clusters were found to be an effective statistical 
measure which can discriminate among models, whereas measures such as the number density 
of Lyα absorption lines, line-line correlations, etc. fail to do so. 

We now continue our work in this direction. The central topic of this paper is 
the power spectrum of the density perturbation. We will study how to determine 
the power spectrum of a density field by the wavelet SSD. The wavelet-based measure 
of the spectrum is, in some sense, similar to the count-in-cell (CIC) method. CIC 
detects the variance $\sigma^2$ of density fluctuations in windows of a cubical cell with side $l$ 
or a Gaussian sphere with radius $R_G$. It is believed that the variance in cell $l$ is mainly 
contributed by the perturbation on scale $\sim 2l$ or $R_G$. Therefore, the variances should 
be a measure of the power spectrum on scale $l$ (Efstathiou et al. 1990, Saunders 1991, 
Peacock 1991). 

An advantage of the CIC spectrum estimator is that the cells are localized. This 
reduces the uncertainties caused by a poor knowledge of long wavelength 
perturbations and by the finite size of the observational samples. All spectrum estimators 
based on the Fourier transform suffer from an infrared (long-wavelength) uncertainty. 
For instance, the classical spectrum estimator, i.e. the Fourier transform of the 
autocorrelation function (Peebles 1980), depends essentially on a good measure of the 
Abstract 

A method for measuring the spectrum of a density field by a discrete wavelet 
space-scale decomposition (SSD) has been studied. We show how the power 
spectrum can effectively be described by the father function coefficients (FFCs) of the 
wavelet SSD. We demonstrate that the features of the spectrum, such as the 
magnitude, the index of a power law, and the typical scales, can be determined with high 
precision by the FFC reconstructed spectrum. This method does not require the mean 
density, which normally is poorly determined. The problem of the complex geometry 
of observed samples can also be easily solved because the bases are always orthogonal, 
regardless of the geometry of the samples. Using this method, we examine the spectra 
inferred from Lyα forests of both simulated and real samples. We find that 1.) the 
magnitude of the 1-D spectra is significantly different from a Poisson process; 2.) the 
1-D spectra are flat on scales less than about 5 $h^{-1}$ Mpc, and show a slow increase 
with the scale in a range larger than 5 $h^{-1}$ Mpc; 3.) the reconstructed 3-D spectra 
have about the same power as the COBE normalized linear spectrum of the SCDM 
model on scales less than 40 $h^{-1}$ Mpc, but larger power than the SCDM model on scales 
larger than 40 $h^{-1}$ Mpc; 4.) the magnitudes of high redshift (z > 2.51) spectra 
generally are larger than those of low redshift (z < 2.51) results. Points 3.) and 4.) are 
probably caused by large geometric biasing on large scales and high redshifts. 



Key words: Cosmology - spectrum of perturbation - Lyα forest 